Using Distributed Query Result Caching to Evaluate Queries for Parallel Data Mining Algorithms
Abstract
An increase in the speed of data mining algorithms can be achieved by improving the efficiency of the underlying technologies. Query engines are key components in many knowledge discovery systems, and the appropriate use of query engines can affect the performance of data mining algorithms. By taking advantage of hypothesis generation patterns, the queries generated from the hypotheses can be evaluated more efficiently. Caching query results and using the cached results to evaluate new queries with similar constraints reduces the complexity of query evaluation and improves the performance of data mining algorithms. In a multi-processor environment, distributing the query result caches can improve the performance of parallel query evaluations. This idea has been used in the ParDRI system and has resulted in significant improvements in the execution times of ParDRI.
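The caching idea in the abstract can be illustrated with a minimal sketch. It is not ParDRI's implementation; the class `QueryResultCache`, the constraint tuples, and the sample table are all hypothetical names chosen for illustration. The sketch assumes a query is a conjunction of simple constraints, so a new query whose constraints include those of a cached query can be answered by filtering the cached rows instead of rescanning the whole table. In a multi-processor setting, one could imagine each worker holding its own cache instance; that part is not shown here.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Row = Dict[str, int]
Constraint = Tuple[str, str, int]  # (column, operator, value), e.g. ("age", ">", 30)

# Supported comparison operators (an assumption made for this sketch).
OPS: Dict[str, Callable[[int, int], bool]] = {
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def satisfies(row: Row, constraint: Constraint) -> bool:
    """Return True if the row meets a single constraint."""
    column, op, value = constraint
    return OPS[op](row[column], value)


class QueryResultCache:
    """Hypothetical cache keyed by the set of constraints in a conjunctive query."""

    def __init__(self, table: List[Row]) -> None:
        self.table = table
        self.cache: Dict[FrozenSet[Constraint], List[Row]] = {}
        self.rows_scanned = 0  # counts rows examined, to show the saving

    def evaluate(self, constraints: FrozenSet[Constraint]) -> List[Row]:
        # Exact hit: the same query was evaluated before.
        if constraints in self.cache:
            return self.cache[constraints]

        # Otherwise start from the smallest cached result whose constraints
        # are a subset of the new query; fall back to the full table.
        base_key: FrozenSet[Constraint] = frozenset()
        base_rows = self.table
        for key, rows in self.cache.items():
            if key <= constraints and len(rows) < len(base_rows):
                base_key, base_rows = key, rows

        # Only the constraints not already enforced by the cached result
        # need to be checked.
        remaining = constraints - base_key
        self.rows_scanned += len(base_rows)
        result = [r for r in base_rows if all(satisfies(r, c) for c in remaining)]
        self.cache[constraints] = result
        return result


# Usage: the second query refines the first, so it scans 4 cached rows, not 6.
table = [
    {"age": a, "income": i}
    for a, i in [(25, 30), (35, 40), (45, 80), (55, 90), (65, 20), (30, 70)]
]
cache = QueryResultCache(table)
older = cache.evaluate(frozenset({("age", ">", 30)}))
older_rich = cache.evaluate(frozenset({("age", ">", 30), ("income", ">", 50)}))
print(len(older), len(older_rich), cache.rows_scanned)  # 4 2 10
```

The design choice here is to key the cache on a frozen set of constraints, which makes the "similar constraints" test a plain subset check. A real system would also need eviction and a richer notion of constraint containment, neither of which the abstract specifies.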
Related resources
Improving the view selection algorithm in data warehouses by finding frequent queries
A data warehouse is a repository of historical data used to support decision making. Analytic queries usually take a long time to run. To address the response-time problem, some views should be materialized so that all queries can be answered in minimum response time. There are many solutions to the view selection problem. The most appropriate is to materialize frequent queries. Previously posed ...
Multiple query scheduling for distributed semantic caches
In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially when dealing with computationally expensive queries. The scheduling policies must be designed to take into consideration the dyn...
Parallel Visual Information Retrieval in VizIR
This paper describes how parallel retrieval is implemented in the content-based visual information retrieval framework VizIR. Generally, two major use cases for parallelisation exist in visual retrieval systems: distributed querying and simultaneous multi-user querying. Distributed querying includes parallel query execution and querying multiple databases. Content-based querying is a two-step p...
Separating indexes from data: a distributed scheme for secure database outsourcing
Database outsourcing relieves organizations of the burden of database management. Since data is a critical asset of an organization, its privacy must be protected from outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query grows, they require too much memory and processing to remain practical, so random and evolutionary methods must be used instead. The use of evolutionary methods, beca...
Publication date: 1998